ABSTRACT

This thesis presents a design and partial implementation of
a program family of extended pretty printers. Factors that
influence the readability (perception) and understandability
(cognition) of computer programs are identified, previous
work is reviewed, and new solutions are suggested.
Extensions to previous pretty printer designs include a
capability to selectively display levels of control of a
program. In order to accommodate different computer
languages and to allow for several secondary functions, a
family of pretty printers is designed. This design
facilitates easy extension, contraction and modification.

I. INTRODUCTION


Programs are written to be read and understood by people.
The textual representation should be such that it is easy
to read. That is, the representation of the program should
reduce the visual burden on the user and allow him to
develop and exploit visual clues to aid in reading. In
addition, the text of the program should be designed so that
it is easy for the reader to grasp the meaning of the
program: that is, the representation of the program should
help the reader understand the program.

Fifteen years ago Dijkstra argued that "... our intellectual
powers are rather geared to master static relations and our
powers to visualize processes evolving in time are
relatively poorly developed. For that reason we should do
(as wise programmers aware of our limitations) our utmost to
shorten the conceptual gap between the static program and
the dynamic process, to make the correspondence between the
program (spread out in text space) and the process (spread
out in time) as trivial as possible." [Ref. 16]. There is
an additional conceptual gap between the program spread out
in text and how we represent and manipulate the static
program and its dynamic process in our minds. Here also we
should try to narrow the conceptual gap so that the program
is easy to read and to understand.

In the computer science literature, readability and
understandability are often used interchangeably.
Readability is related to physical conditions, for instance,
the size, type font, color, and clarity of characters,
proper indentation, and the spacing between lines.
Understandability is related to psychological conditions,
for instance, pattern, memory, logic, and repetition
learning. Precisely speaking, readability means good
perception and understandability means good cognition. The
system designed in this thesis seeks to improve both
readability and understandability by reformatting computer
programs and presenting the user with alternative
representations to aid understanding.

There is evidence that the readability and understandability
of computer programs are important issues directly related
to programmer productivity. Although this has been
recognized for some time, further improvements in the
textual representation of computer programs are possible.
This thesis reviews the previous work, analyzes the
remaining problems, and proposes new solutions.




II. EXTENDED PRETTY PRINTER


A. BACKGROUND

In a study of commercial programming practices, Elshoff
[Ref. 5] found that most programs were poorly written. They
were very large, extremely difficult to read and understand,
and more complex than necessary. Furthermore, the study
determined that programming language usage was poor and
inconsistent. The results of the survey by Lientz [Ref. 6]
show that the quality of programming is a generally
perceived problem. There has been a major effort to improve
programming practices, but there still exist many programs
that are difficult to read and understand and yet must
regularly be corrected and/or modified.

There are many factors connected with the readability and
understandability of a computer program. The reader's
familiarity with the program, knowledge of the application
area, and own programming style are important factors that
are mostly independent of the program [Ref. 4]. This thesis
concentrates on the representation of program text that
affects its readability and understandability. A readable
program always seems to exhibit a common set of properties
[Ref. 8] [Ref. 9] [Ref. 10]. The program is well commented.
The logical structure of the program is constructed from a
common set of single-entry single-exit flow of control
units. Variable names are mnemonic and references to them
localized. The program's physical layout makes the salient
features of the algorithm that is implemented stand out
[Ref. 14].


Since abstraction is an important mechanism that people use
to understand programs, the suppression of details in a
program can aid understanding. Modern design methodologies
include top down design using stepwise refinement. In this
methodology, the programmer designs successive levels of the
program. These levels are visible during the design but are
often not visible in the final program. The
understandability of a program can be improved by making the
levels of the program structure visible. It is true that a
program may have all these properties and still be
unreadable and not understandable; however, the readability
and understandability of a program are certain to suffer
when it lacks one or more of the properties [Ref. 14].

B.  DEFINITION 

Rubin [Ref. 14] defined a pretty printer as follows:
"It is a software tool to format programs to make them
easier to read and understand." The extended pretty printer
can be defined as a software tool to improve readability
and understandability by adding level documentation,
commenting and reformatting. These extensions to pretty
printers will aid people in understanding the program by
making the structure of the program more visible and by
supporting the viewing of the levels of the program. The
primary function of an extended pretty printer is to add
some level documentation and comments, to insert spaces and
linefeeds between tokens (character strings), and to decide
where and how to break lines that are too long to fit on the
output medium.
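
The spacing and line-breaking function described above can
be sketched as follows. This is only an illustration in
Python, not the implementation of this thesis; the name
layout_tokens, the width parameter, and the continuation
indent are assumptions introduced for the example.

```python
def layout_tokens(tokens, width=40, indent=0, continuation=4):
    """Join tokens with single blanks, breaking lines longer than `width`.

    Hypothetical helper: continuation lines are indented a further
    `continuation` columns so they visibly belong to the first line.
    """
    lines = []
    current = " " * indent
    empty = True  # True while the current line holds no token yet
    for tok in tokens:
        candidate = current + tok if empty else current + " " + tok
        if len(candidate) > width and not empty:
            # The token does not fit: close this line, start a new one.
            lines.append(current)
            current = " " * (indent + continuation) + tok
        else:
            current = candidate
        empty = False
    if not empty:
        lines.append(current)
    return lines
```

A token that is itself longer than the width is still placed
on a line of its own; a real pretty printer would need a
policy for that case.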


C. GOALS


The methods for improving the readability and
understandability of a program use a set of specific
transformations that can be applied to the program text.
The following program transformations can be done by an
extended pretty printer.

1. Reformat

The consistent formatting of programs is very important.
Elshoff [Ref. 14] said "Just as paragraphing and sectioning
help written English, so can indentation, keyword
positioning, and logical grouping aid a programming
language." Those jobs can be done automatically by a pretty
printer, which allows the program to be read more easily.

2.  Add  Level  Structure  Documentation 

In writing about his experiments on program comprehension,
Shneiderman [Ref. 17] said "Instead of absorbing the program
on a character-by-character basis, programmers recognize the
function of groups of statements and then piece together
these chunks to form ever larger chunks until the entire
program is comprehended." This experiment suggests that the
level documentation (chunks) of a program will help the
understandability of the program.

3. Standardization

Standardization contributes to the understandability of a
program. To understand this, it is helpful to know the
source of the expert programmer's capacity. The primary
piece of direct behavioral evidence for this is
Shneiderman's replication [Ref. 26] for programming of Chase
and Simon's classic study on memory for chess positions
[Ref. 27]. In both these studies, the experts in a
particular domain could memorize information from that
domain (i.e. a program or a chess position) far better than
novices, provided that the information was appropriately
structured. If the structure was made random (by shuffling
the statements of the program or rearranging the chess
pieces), the advantage of the expert was greatly reduced.
This means that the expert has no better memory than the
novice, but rather an elaborate knowledge structure in terms
of which correspondingly structured items can be very
efficiently encoded [Ref. 15].

If this result is extended to programming, it suggests that
the expert programmer gets his better knowledge of programs
from visible program structure. As noted above, if the
textual representation is not structured (e.g. random), the
expert programmer will lose part of his capability. People
understand something better when they can integrate it with
what they already know. From this view, standardization
helps people to understand other people's programs more
quickly. The visual cues are important in order to unburden
the program reader. The final objectives of computer
program standards are to ensure consistency, reduce program
development and testing time, improve maintainability of
programs, and improve changeability of programs [Ref. 12].
Programming standards are not intended to stifle the
imagination of programmers. Experiments of Godfrey
[Ref. 12] have shown that standards simply remove the
drudgery of coding and allow programmers to concentrate more
on the problem at hand. It should be noted that the
establishment of standards is a costly process. It should
also be kept in mind that programming standards are not a
panacea for eliminating all poorly written programs.
Adherence to these standards will not automatically produce
'good' code [Ref. 12].

There are multiple levels of understanding a program. It is
possible to follow each line of code without understanding
the overall program function. It may also be possible to
understand the program function but not understand each of
the steps. There is also a middle level of understanding
concerning control structures, module design, and data
structures [Ref. 17]. Skimming for a top down view means
suppressing detail until the overall program is understood.
Then the program is read selectively and understood in more
detail.


4. Example

The following example will show how reformatting, level
structure documentation, and standardization help the
readability and understandability of a program.

The bubble-sort algorithm will be introduced for this
example [Ref. 18]. The idea of the bubble sort is as
follows: "We go through a list comparing adjacent items and
exchanging those that are out of order. During such a
compare-and-exchange pass, an item moves forward in the list
until it 'bumps up against' a larger item." [Ref. 18]. An
algorithm language [Ref. 18] and structured FORTRAN will be
used for this example.


THE ALGORITHM FOR BUBBLE-SORT :

ALGORITHM BUBBLE-SORT
   INPUT N
   INPUT LIST(1:N)
   REPEAT
      NO-EXCHANGES <-- TRUE
      FOR I <-- 1 TO N - 1 DO
         IF LIST(I) > LIST(I+1) THEN
            TEMP <-- LIST(I)
            LIST(I) <-- LIST(I+1)
            LIST(I+1) <-- TEMP
            NO-EXCHANGES <-- FALSE
         END IF
      END FOR
   UNTIL NO-EXCHANGES
   OUTPUT LIST(1:N)
END BUBBLE-SORT
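
For readers who wish to run the algorithm, the same
compare-and-exchange scheme can be written in Python as in
the sketch below. It is an illustrative transcription only;
the function name bubble_sort is an assumption and Python is
not one of the languages used in this thesis.

```python
def bubble_sort(items):
    """Sort a list in place by repeated compare-and-exchange passes."""
    while True:                              # REPEAT
        no_exchanges = True
        for i in range(len(items) - 1):      # FOR I <-- 1 TO N - 1
            if items[i] > items[i + 1]:
                # Exchange the adjacent pair that is out of order.
                items[i], items[i + 1] = items[i + 1], items[i]
                no_exchanges = False
        if no_exchanges:                     # UNTIL NO-EXCHANGES
            return items
```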


UNFORMATTED FORTRAN PROGRAM FOR BUBBLE SORT :

      INTEGER LIST(100),I,N,TEMP
      LOGICAL NOEXG
      READ(5,100) N
  100 FORMAT(I5)
      READ(5,100) LIST
    5 CONTINUE
      NOEXG = .TRUE.
      DO 777 I=1,N-1
      IF(.NOT.(LIST(I).GT.LIST(I+1))) GO TO 10
      TEMP = LIST(I)
      LIST(I) = LIST(I+1)
      LIST(I+1) = TEMP
      NOEXG = .FALSE.
   10 CONTINUE
  777 CONTINUE
      IF(.NOT.NOEXG) GO TO 5
      WRITE(6,200) LIST
  200 FORMAT(1X,I7)
      STOP
      END


The following shows some of the possible outputs of an
extended pretty printer. Indentation is used to improve
readability. Selective display of the levels of the control
structure of the program, both in FORTRAN and in a
generalized programming language, is used to support
improved understandability. The reader selects the textual
representation that best supports his current perceptual and
cognitive needs.


LEVEL IA :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG

      READ(5,100) N
      READ(5,100) LIST
      REPEAT
         COMPOUND STATEMENT
      UNTIL NOEXG
      WRITE(6,200) LIST

      STOP

  100 FORMAT
  200 FORMAT

      END


LEVEL IB :

      DECLARATION
      DECLARATION

      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         COMPOUND STATEMENT
      ENDUNTIL
      SIMPLE STATEMENT

      STOP

      END


This shows the first level of the bubble sort program. Here
only the repeat condition is represented, so the reader of
the program can see simply the highest level structure of
the program and can understand the overall design of the
program more easily. The reader can then select
presentations that show additional levels until the
completed program is displayed.
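
The selective display of levels can be sketched in a few
lines of Python. This is a simplified illustration and not
the design of this thesis: the nested-tuple program
representation, the names render and PROGRAM, and the
three-column indentation step are all assumptions made for
the example.

```python
def render(nodes, max_level, level=1, step=3):
    """Return display lines, showing control structure down to max_level.

    A node is either a string (simple statement) or a tuple
    (opener, body, closer) describing one control construct.
    """
    pad = " " * (step * (level - 1))
    lines = []
    for node in nodes:
        if isinstance(node, str):
            lines.append(pad + node)
            continue
        opener, body, closer = node
        lines.append(pad + opener)
        if level >= max_level:
            # Suppress the detail below the requested level.
            lines.append(pad + " " * step + "COMPOUND STATEMENT")
        else:
            lines.extend(render(body, max_level, level + 1, step))
        lines.append(pad + closer)
    return lines


# Hypothetical internal form of the bubble sort program.
PROGRAM = [
    "READ N",
    "READ LIST",
    ("REPEAT",
     ["NOEXG = .TRUE.",
      ("FOR I = 1 TO N - 1",
       [("IF LIST(I) > LIST(I+1) THEN",
         ["TEMP = LIST(I)", "LIST(I) = LIST(I+1)",
          "LIST(I+1) = TEMP", "NOEXG = .FALSE."],
         "ENDIF")],
       "ENDFOR")],
     "UNTIL NOEXG"),
    "WRITE LIST",
]
```

Calling render(PROGRAM, 1) gives the highest level view, and
each larger value of max_level reveals one more level until
no COMPOUND STATEMENT placeholder remains.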


LEVEL IIA :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG

      READ(5,100) N
      READ(5,100) LIST
    5 CONTINUE
         NOEXG = .TRUE.
         FOR I = 1 TO N - 1
            COMPOUND STATEMENT
         ENDFOR
      IF (.NOT. NOEXG) GO TO 5
      WRITE(6,200) LIST

      STOP

  100 FORMAT(I5)
  200 FORMAT(1X,I7)

      END


LEVEL IIB :

      DECLARATION
      DECLARATION

      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         SIMPLE STATEMENT
         DO FOR
            COMPOUND STATEMENT
         ENDFOR
      ENDREPEAT
      SIMPLE STATEMENT

      STOP

      END


LEVEL IIIA :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG

      READ(5,100) N
      READ(5,100) LIST
    5 CONTINUE
         NOEXG = .TRUE.
         DO 777 I = 1, N - 1
            IF LIST(I) > LIST(I+1) THEN
               COMPOUND STATEMENT
            ENDIF
  777    CONTINUE
      IF (.NOT. NOEXG) GO TO 5
      WRITE(6,200) LIST

      STOP

  100 FORMAT(I5)
  200 FORMAT(1X,I7)

      END


LEVEL IIIB :

      DECLARATION
      DECLARATION

      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         SIMPLE STATEMENT
         DO FOR
            IF CONDITION THEN
               COMPOUND STATEMENT
            ENDIF
         ENDFOR
      ENDREPEAT
      SIMPLE STATEMENT

      STOP

      END



For most experienced programmers who are familiar with top
down design with stepwise refinement, the following
representations are easier to read and understand than the
initial programs.


FINAL SOURCE PROGRAM :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG

      READ(5,100) N
      READ(5,100) LIST
    5 CONTINUE
         NOEXG = .TRUE.
         DO 777 I = 1, N-1
            IF (.NOT.(LIST(I).GT.LIST(I+1))) GO TO 10
               TEMP = LIST(I)
               LIST(I) = LIST(I+1)
               LIST(I+1) = TEMP
               NOEXG = .FALSE.
   10       CONTINUE
  777    CONTINUE
      IF (.NOT. NOEXG) GO TO 5
      WRITE(6,200) LIST

      STOP

  100 FORMAT(I5)
  200 FORMAT(1X,I7)

      END


FINAL STRUCTURE DOCUMENTATION :

      DECLARATION
      DECLARATION

      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         SIMPLE STATEMENT
         DO FOR
            IF CONDITION THEN
               SIMPLE STATEMENT
               SIMPLE STATEMENT
               SIMPLE STATEMENT
               SIMPLE STATEMENT
            ENDIF
         ENDFOR
      ENDREPEAT
      SIMPLE STATEMENT

      STOP

      END


III. SOME APPROACHES AND VARIOUS OBJECTIVES

A. SOME APPROACHES

There have been many attempts to improve understandability
and readability. The following are typical examples.

1. Neater2

Neater2  accepts  a  PL/I  source  program  and  operates 
on  it  to  produce  a  reformatted  version.  When  in  the  LOGICAL 
mode,  it  indicates  the  logical  structure  of  the  source 
program  in  the  indentation  pattern  of  its  output.  A  number 
of  options  are  available  to  give  the  user  full  control  over 
the  output  format  and  to  maximize  its  utility.  [Ref.  19] 

2. Prettyprint

Prettyprint takes as input a Pascal program and reformats
the program according to a standard set of pretty printing
rules. The pretty printing rules are given, i.e., fixed.
[Ref. 22]

3. Pascal Program Formatter

Format is a flexible pretty printer for Pascal programs. It
takes as input a syntactically-correct Pascal program and
produces as output an equivalent but reformatted Pascal
program. The resulting program consists of the same
sequence of Pascal symbols and comments, but they are
rearranged with respect to line boundaries and columns for
readability. [Ref. 20]


The flexibility of Format is accomplished by allowing the
user to supply various directives (options) which override
the default values. Rather than being a rigid pretty
printer which decides how a program is to be formatted,
Format gives the user the ability to control how formatting
is done, not only prior to execution but also during
execution through the use of pretty printer directives
embedded in the program. [Ref. 20]


4. Contour

Contour is a program whose purpose is to graphically
illustrate a program's structure. It operates by bounding
the scope of loops and conditionals by solid (or nearly
solid) lines. When compound statements are embedded in
other compound statements, one obtains, rather than
confusion, a rather pleasant display reminiscent of the
contour lines of a topographical map. [Ref. 22]


5. Syntax-Directed Pretty Printer


The syntax-directed pretty printer is language independent.
It is divided into two phases: the grammar processing phase
and the program processing phase. A language grammar for
the specific language must be provided. It is much easier
and quicker to write a grammar for a language than to code a
new pretty printer for a specific language. It can work for
all structured programming languages and, with minor
modifications, can work for other languages. It can handle
such problems as comments and error recovery. [Ref. 14]

6. Others


The recent availability of low cost, high quality computer
printers allows additional opportunities to improve
readability and understandability. Important characters or
words can be represented with different fonts: for instance,
the keywords can be represented by bold characters or be
underlined so that they are recognized more easily than
other words. This can improve the readability of a program.

B. VARIOUS OBJECTIVES

Although the final objective of all approaches is to improve
the readability and understandability of the program, there
are many secondary objectives. The following are typical
examples of them:

Teaching structure: An automatic system that checks
structure and indentations can help beginning students learn
good programming practice. A system that gives clear
corrections to mistakes can provide a student with quick
feedback. Such a system helps a student to learn structured
programming and to learn a set of programming standards.

Standardization in a programming organization: For large
software projects with many programmers, program
standardization is necessary to help in communication among
programmers.

Reformatting for maintenance: There are many programs that
are very difficult to read. The maintenance process can be
helped if programs can be transformed into a form that is
familiar to the maintenance programmers. The scoping
capability of an extended pretty printer as described above
can also help programmers understand programs they are
correcting and modifying.

Automatic corrections: An extended pretty printer can check
the indentation of programs, correct indentation errors, and
give the user messages explaining the errors.

From the above observations, several common parts of the
existing approaches can be found. First, most of the
systems are for a specific programming language; for another
programming language they would have to be written again.
The one exception is the syntax directed pretty printer,
which requires a grammar for each new language. Defining a
correct grammar is not an easy task. Second, most of the
systems try to make the pretty printer flexible, but the
flexibility is limited to a few options and it is not easy
to extend them. Most constructs of the pretty printers are
fixed, but the constructs themselves can be changed, e.g.,
extended or contracted. New structures for indentation can
be generated.


IV. PROGRAM FAMILY


A.  DEFINITION 

Program families are defined by Parnas [Ref. 13] as sets of
programs whose common properties are so extensive that it is
advantageous to study the common properties of the programs
before analyzing individual members. Program families are
analogous to the hardware families promulgated by several
manufacturers. Although the various models in a hardware
family might not have a single component in common, almost
everyone reads the common 'principles of operations' manual
before studying the special characteristics of a specific
model [Ref. 13].

B.  DESIGN  METHODOLOGY 

Parnas [Ref. 13] shows how module specifications define a
family. This is an important guide for selecting the design
method. Members of a family of programs defined by a set of
module specifications can vary in three principal ways.

1. Implementation methods used within the modules.

Any combination of sets of programs which meet the module
specifications is a member of the program family.
Subfamilies may be defined either by dividing each of the
main modules into submodules in alternative ways, or by
using the method of structured programming to describe a
family of implementations for the module.

2. Variation in the external parameters.

The  module  specifications  can  be  written  in  terms  of 
parameters  so  that  a  family  of  specifications  results. 
Programs  may  differ  in  the  values  of  those  parameters  and 
still  be  considered  to  be  members  of  the  program  family. 

3. Use of subsets.

In many situations one application will require only a
subset of the functions provided by a system. We may
consider programs which consist of a subset of the programs
described by a set of module specifications to be members of
a family as well.

As discussed above, there are many primary and secondary
objectives for a pretty printer. One approach to these
various demands would be to design a large program with many
options. Such an approach has several drawbacks. First,
the resulting program would be large and necessarily
complex. Second, for each specific use of the program the
unneeded options would most likely impose an unnecessary
computational burden. The notion of a program family offers
an alternative design. A separate program will be written
for each demand; however, all these programs will share a
common design and many modules will be common to several
family members.
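
The idea of building each family member from a subset of
shared modules can be sketched as follows. The sketch is an
illustration in Python under stated assumptions, not the
design of this thesis: the three shared modules, the helper
make_member, and the two named members are all hypothetical.

```python
def strip_trailing_blanks(lines):
    """Shared module: remove trailing blanks from every line."""
    return [line.rstrip() for line in lines]


def upper_case(lines):
    """Shared module: convert the program text to upper case."""
    return [line.upper() for line in lines]


def number_lines(lines):
    """Shared module: prefix each line with a sequence number."""
    return [f"{n:4d}  {line}" for n, line in enumerate(lines, start=1)]


def make_member(*passes):
    """Build one family member from a chosen subset of shared modules."""
    def member(lines):
        for apply_pass in passes:
            lines = apply_pass(lines)
        return lines
    return member


# Two hypothetical family members; each carries only what it needs.
teaching_printer = make_member(strip_trailing_blanks, number_lines)
maintenance_printer = make_member(strip_trailing_blanks, upper_case)
```

Neither member pays for a function it does not use, which is
the drawback of the single large program with many options.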

The concept of program families provides one way of
considering program structure more objectively. For any
precise description of a program family (either an
incomplete refinement of a program or a set of
specifications or a combination of both) one may ask which
programs have been excluded and which still remain
[Ref. 13]. The criteria for defining modules can be a way
to select or distinguish some design methodologies [Ref. 3].


C. PROGRAMMING LANGUAGE FOR OBJECT ORIENTED DESIGN


A design methodology alone is not sufficient to create
computer solutions [Ref. 3]. Some features of a programming
language can also help in creating good software. In the
following table, P. Wegner has categorized some of the most
popular languages into generations, along with some of the
language features they introduced:


TABLE I

Programming Language Generation Table

Generation   Languages                          Period
1ST          FORTRAN I, ALGOL58                 1954 - 1958
2ND          FORTRAN II, ALGOL60, COBOL, LISP   1959 - 1961
3RD          PL/I, ALGOL68, PASCAL              1962 - 1970
GAP                                             1970 - 1980

ADA was developed at the end of the language generation gap,
and so has been influenced by contemporary software
methodologies. The following figures show the topologies of
each generation and ADA. ADA's topology is not flat like
those of the previous generations, but rather is
multi-dimensional [Ref. 3].


Figure 4.1 Topology for 1st and 2nd Generation


Figure 4.3 Topology of ADA.

The following key features of ADA support the tools for
implementing the object oriented design [Ref. 23].

1. Programming in the large.

Mechanisms  for  encapsulation,  separate  compilation,  and 
library  management  are  necessary  for  the  writing  of  portable 
and  maintainable  programs  of  any  size. 

2.  Exception  handling. 

Large programs are rarely correct. It is necessary to
provide  a  means  whereby  a  program  can  be  constructed  in  a 
layered  and  partitioned  way  so  that  the  consequences  of 
errors  in  one  part  can  be  contained. 

3. Data abstraction.

Extra portability and maintainability can be obtained if
the details of the representation of data can be kept
separate from the specifications of the logical operations
on the data.

4. Tasking.

For many applications it is important that the program
be  conceived  as  a  series  of  parallel  activities  rather 
than  just  as  a  single  sequence  of  actions.  Building  appro¬ 
priate  facilities  into  a  language  rather  than  adding  them 
via  calls  to  an  operating  system  gives  better  portability 
and  reliability. 

5. Generic units.

In many cases the logic of part of a program is independent
of the types of the values being manipulated. A mechanism
is therefore necessary for the creation of related pieces of
program from a single template. This is particularly useful
for the creation of libraries.


V. THE SOLUTION


A. PROBLEM AND SOLUTION

As shown above, most traditional approaches to pretty
printers are for a specific programming language. A recent
development is the syntax directed pretty printer that can
be used for different languages by providing a grammar of
the language. The requirement to provide a language grammar
represents a non-trivial task. There are many different
secondary objectives for a pretty printer for different
users. The functions of traditional pretty printers are
not enough to improve both readability and
understandability; for example, the program level construct
documentation that traditional approaches do not support is
needed to help in understanding a given program. In short,
there are many programming languages and many purposes, but
there is not a system that satisfies all those requests and
can be modified easily.

In the previous section, the concept of a program family was
discussed. The best way to meet the various demands and
handle the many programming languages is to construct a
program family for the extended pretty printer. The
characteristics of a program family will permit easy change,
easy extension, and easy contraction. Each programming
language will have a module for itself, and data abstraction
and procedural abstraction will be used to hide design
decisions that will differ among the members of the program
family. Data and procedural abstraction will also allow
some modules to be used by all program family members. For
example, the blank operations are an important data
abstraction. These operations can be used for all
programming languages and objectives.
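
The division between language modules and shared
abstractions can be sketched as follows. This is a minimal
illustration in Python and not the implementation of this
thesis; the class names, the keyword sets, and the function
indent_program are assumptions chosen for the example.

```python
class BlankOps:
    """Shared blank operations, usable by every family member."""

    @staticmethod
    def indent(level, step=3):
        """Return the blanks that precede a line at this level."""
        return " " * (level * step)

    @staticmethod
    def squeeze(text):
        """Collapse runs of blanks between tokens to a single blank."""
        return " ".join(text.split())


class PascalModule:
    """Hides Pascal-specific decisions (hypothetical keyword sets)."""
    openers = {"BEGIN", "REPEAT"}
    closers = {"END", "END;", "END.", "UNTIL"}


class AlgorithmModule:
    """Hides the decisions for the algorithm language of the example."""
    openers = {"REPEAT", "FOR", "IF"}
    closers = {"UNTIL", "END"}


def indent_program(lines, language):
    """Re-indent lines using only the language module's interface."""
    out, level = [], 0
    for raw in lines:
        text = BlankOps.squeeze(raw)
        if not text:
            continue
        first = text.split()[0].upper()
        if first in language.closers:
            level = max(level - 1, 0)
        out.append(BlankOps.indent(level) + text)
        if first in language.openers:
            level += 1
    return out
```

Supporting another language in this sketch means supplying
another small module; indent_program and BlankOps are reused
unchanged.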


B. GENERALIZED PROGRAMMING LANGUAGE CONSTRUCT

For generalized indentation and level documentation, a
general internal representation of program structure is
required that is independent of any particular programming
language. Let us call it a generalized formatter structure.
Since there are many programming language constructs in the
many different programming languages, it is too difficult to
define a perfect universal programming language formatter
construct. So, we define here a generalized programming
language formatter construct that can cover only a limited
number of programming languages: structured FORTRAN, PASCAL
and some other structured programming languages. For
simplicity, the detailed representation of a simple
statement will be omitted.

The structure of the program will be shown by indenting
the constructs. First, the control structure will be
considered. Dijkstra argued that control flow should be
limited to three basic structures - linear sequence,
structured selection, and structured iteration. But many
programmers use the following five structures - if, case,
while, until, and do for. A block can also be an element of
the structure. Second, most program units are divided into
two parts: a declarative part and an imperative part. This
is also important for the indentation. Appendix A describes
the generalized format structures in detail.


C.  ANALYSIS AND DESIGN


1.  Analysis

The extended pretty printer has two basic functions.
The first is to reformat the source program, i.e. to
indent, to insert spaces and linefeeds between tokens, and
to decide where and how to break lines that are too long to
fit on the output medium. The second is to produce level
structure documentation of the source program. The basic
requirement of the total system is that it has to be easy to
change, easy to extend and easy to contract, i.e. it should
be independent of the programming language and should be
able to fulfill a variety of purposes.

Every structured programming language can be represented
as English is - character, word, statement, compound
statement (paragraph), unit program (a paper). What is of
interest is the way to represent these components as lines.
The relationship between these components and lines is very
important for the extended pretty printer. Table II shows
the relationship between line and statement. The other
components are related to the statement, so every component
can be represented by lines.

Each level is represented by the source program
structures. The structures are represented by statements.
So, each statement can have a level degree.

2.  Design

As noted in the section on program families, the
most important aspect of this system design is to identify
the objects. For the indentation, the line and the statement
are the basic elements. The blank is another important
object. For the construct representation, the level has to
be an object.

TABLE II

Relationship Table

LINE        STATEMENT

one         one
one         many
one         part
one         part and one/many

("part" means part of a statement)

Parts that depend heavily on each other should be
encapsulated in a module to allow for easy change. The
indentation policy can change in various ways, so it needs
to be manipulated independently. To manipulate the input
programming languages independently, the program should be
an independent module. The program module needs some data
structures (STACK, QUEUE), a keyword table, and some
statement operations. The files (the input source file and
the output file) and their formats can change easily. So,
the input/output file manipulations need to be separated
from the other modules.

For convenience, the modules will be divided into two
kinds. Passive modules are used by other modules but do not
use other modules themselves, for example, blank, level,
stack, queue and line. Active modules use the other modules,
for example, input, output, program and so on. ADA will be
used for the detailed design of the system. The detailed
design follows.

a.  Passive Modules

(1). Stack Module. This module provides the basic stack
operations, as the following procedures, for the other
modules that use them [Ref. 24].

generic
   type ITEM is private;
package STACK is
   type LIST is private;
   procedure CREATE(L: out LIST);
   procedure PUSH(L: in out LIST; I: in ITEM);
   procedure POP(L: in out LIST);
   function TOP(L: LIST) return ITEM;
   underflow : exception;
private
   type NODE;
   type LIST is access NODE;
   type NODE is record
      head : ITEM;
      tail : LIST;
   end record;
end STACK;

(2). Queue Module. This module provides the basic queue
operations, as the following procedures, for the other
modules that use them [Ref. 24].

generic
   type ITEM is private;
package QUEUE is
   type LIST is private;
   procedure CREATE(L: out LIST);
   procedure ENQUEUE(L: in out LIST; I: in ITEM);
   -- Insert the item into the rear of QUEUE
   procedure DEQUEUE(L: in out LIST; I: out ITEM);
   -- Delete the item from the front of QUEUE
   underflow : exception;
private
   type NODE;
   type LIST is access NODE;
   type NODE is record
      head : ITEM;
      tail : LIST;
   end record;
end QUEUE;
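
The STACK and QUEUE specifications above translate almost directly into other languages. The following Python sketch is illustrative only and is not the thesis implementation: the class names, the single `Underflow` exception shared by both containers, and the use of `list` and `collections.deque` in place of the linked NODE records are assumptions made for brevity.

```python
from collections import deque


class Underflow(Exception):
    """Raised when an item is requested from an empty stack or queue."""


class Stack:
    def __init__(self):              # corresponds to CREATE
        self._items = []

    def push(self, item):            # PUSH
        self._items.append(item)

    def pop(self):                   # POP: discard the top item
        if not self._items:
            raise Underflow("pop from an empty stack")
        self._items.pop()

    def top(self):                   # TOP: return the top item, keep it
        if not self._items:
            raise Underflow("top of an empty stack")
        return self._items[-1]


class Queue:
    def __init__(self):              # corresponds to CREATE
        self._items = deque()

    def enqueue(self, item):         # ENQUEUE: insert at the rear
        self._items.append(item)

    def dequeue(self):               # DEQUEUE: remove from the front
        if not self._items:
            raise Underflow("dequeue from an empty queue")
        return self._items.popleft()
```

A parser for nested constructs would typically push an entry when a construct opens and pop it when the construct closes; an `Underflow` then signals an unmatched closing statement.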

(3). Blank Module. This module provides all the blank
operations (insert, remove, count and so on) for the other
modules that need them.

generic
   type INPUT is private;
package BLANK is
   BLK : constant CHARACTER := ' ';
   subtype NUM is NATURAL;
   procedure INSERT(N, M: in NUM; P: out INPUT);
   -- N : The start column of a line
   -- M : The number of blanks to be inserted
   procedure DELETE(N, M: in NUM; P: out INPUT);
   -- N : The start column of a line
   -- M : The number of blanks to be deleted
   procedure START(L: in INPUT; N: out NUM);
   -- N : The number of blanks in a line
   --     from the start column
   function IS_BLANK(C: in CHARACTER) return BOOLEAN;
   -- Check whether the input character is blank
   -- If blank, return TRUE
   -- Else, return FALSE
   overflow : exception;
end BLANK;
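
As a rough illustration of what the blank operations do, the sketch below applies them to Python strings. It is a sketch under assumptions, not the author's code: the function names, the 1-based column convention, and the choice to return a new string instead of filling an `out` parameter are all hypothetical.

```python
BLK = " "  # the blank character


def insert_blanks(line, start, count):
    """Insert `count` blanks before column `start` (columns start at 1)."""
    if start < 1:
        raise ValueError("columns are numbered from 1")
    i = start - 1
    return line[:i] + BLK * count + line[i:]


def delete_blanks(line, start, count):
    """Delete at most `count` blanks beginning at column `start`."""
    i = start - 1
    j = i
    while j < len(line) and j - i < count and line[j] == BLK:
        j += 1
    return line[:i] + line[j:]


def count_blanks(line, start=1):
    """Count the consecutive blanks beginning at column `start`."""
    i = start - 1
    n = 0
    while i + n < len(line) and line[i + n] == BLK:
        n += 1
    return n


def is_blank(char):
    """Return True if the character is the blank character."""
    return char == BLK
```

Because only blanks are ever removed, `delete_blanks` cannot damage program text even when asked to delete more blanks than are present.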


(4). Level Module. This module provides the level
operations for the other modules that need them. The
operations are:

package LEVEL is
   subtype NUM is NATURAL;
   procedure INCREASE(L: in out NUM);
   -- Increase the level
   -- L : input/output level number
   procedure DECREASE(L: in out NUM);
   -- Decrease the level
   procedure ZERO(L: in out NUM);
   -- Make the level zero or starting point
   overflow : exception;
   underflow : exception;
end LEVEL;
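
The level operations amount to a bounded counter. A minimal Python sketch follows; the class and exception names and the default maximum depth of 50 are assumptions, since the specification above does not state a limit.

```python
class LevelOverflow(Exception):
    """Raised when the nesting level would exceed the allowed maximum."""


class LevelUnderflow(Exception):
    """Raised when the nesting level would drop below zero."""


class Level:
    def __init__(self, maximum=50):   # maximum depth is an assumed limit
        self.value = 0
        self.maximum = maximum

    def increase(self):
        if self.value >= self.maximum:
            raise LevelOverflow("nesting is deeper than the maximum level")
        self.value += 1

    def decrease(self):
        if self.value == 0:
            raise LevelUnderflow("a construct was closed that was never opened")
        self.value -= 1

    def zero(self):
        """Reset the level to the starting point."""
        self.value = 0
```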

(5). Line Module. This module manages the line
object. It provides a set of procedures available to the
other modules that use the line.

generic
   type LINETYPE is private;
package LINE is
   type LINEPOINT is private;
   subtype NUM is NATURAL;
   subtype CHAR is CHARACTER;
   procedure GET_LINE(P: in out LINEPOINT; L: out LINETYPE);
   -- Get a whole line from the internal structure
   -- P : ID for a line
   -- L : Content of a line
   procedure PUT_LINE(P: in out LINEPOINT; L: in LINETYPE);
   -- Put a whole line into the internal structure
   -- P : ID for a line
   -- L : Content of a line
   procedure LINE_LENGTH(P: in LINEPOINT; N: out NUM);
   -- Compute the line length
   -- P : ID for a line
   procedure GET_CHAR(P: in LINEPOINT; N: in NUM; C: out CHAR);
   -- Get the character at the given line and position
   -- P : ID for a line
   -- N : Column of the line
   procedure PUT_CHAR(P: in LINEPOINT; N: in NUM; C: in CHAR);
   -- Put the given character into the given position
   -- of the given line
   -- P : ID for a line
   -- N : Column of the line
   procedure FRONT_INSERT(P: in out LINEPOINT; L: in LINETYPE);
   -- Insert the line in front of the given line position
   -- P : ID for a line
   -- L : Content of a line
   procedure REAR_INSERT(P: in out LINEPOINT; L: in LINETYPE);
   -- Insert the line at the rear of the given line position
   -- P : ID for a line
   -- L : Content of a line
   underflow : exception;
   overflow : exception;
private
   type NODE;
   type LINEPOINT is access NODE;
   type NODE is record
      content : LINETYPE;
      front : LINEPOINT;
      rear : LINEPOINT;
   end record;
end LINE;


(6). Symbol Table Module. This module manages a
symbol table. It is designed for general symbol
manipulation.

generic
   type ITEMTYPE is private;
package SYMBOLTABLE is
   N : constant := 200;          -- size of symbol table
   ITEMSIZE : constant := 20;
   type ITEM is new STRING(1..ITEMSIZE);
   procedure ADD(X: in ITEM; I: in ITEMTYPE);
   -- Insert an item and the information
   -- associated with it into the SYMBOLTABLE
   function IN_TABLE(X: in ITEM) return BOOLEAN;
   -- Check to see if an item is in the SYMBOLTABLE
   function GET(X: in ITEM) return ITEMTYPE;
   -- Retrieve the information associated
   -- with an item in the SYMBOLTABLE
   function FULL return BOOLEAN;
   -- Determine whether or not the SYMBOLTABLE is full
   procedure CLEAR;              -- empty table
   -- Reinitialize (reset) the SYMBOLTABLE
end SYMBOLTABLE;
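
A bounded symbol table of this kind can be sketched in Python with a dictionary. This is only an illustration: the class name, the blank-padding of names to mimic Ada's fixed-length STRING(1..ITEMSIZE), and the use of Python's built-in `OverflowError` when the table is full are assumptions (the specification above declares no exception).

```python
class SymbolTable:
    def __init__(self, size=200, item_size=20):
        self._size = size            # N in the specification
        self._item_size = item_size  # ITEMSIZE in the specification
        self._table = {}

    def _key(self, name):
        # Truncate and blank-pad so every key has exactly item_size characters.
        return name[:self._item_size].ljust(self._item_size)

    def add(self, name, info):
        key = self._key(name)
        if key not in self._table and self.full():
            raise OverflowError("symbol table is full")
        self._table[key] = info

    def in_table(self, name):
        return self._key(name) in self._table

    def get(self, name):
        return self._table[self._key(name)]

    def full(self):
        return len(self._table) >= self._size

    def clear(self):
        self._table.clear()
```

For a pretty printer, the table would typically be loaded once with the keywords of the language being formatted, each mapped to the statement type it introduces.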


b.  Active Modules


(1). Input Module. This module hides the input
format. It reads the original lines from the input media
and calls procedures provided by the line module to store
the lines inside the line object.

with TEXT_IO;
with LINE;
generic
   type LINEPOINT is private;
package INPUT is
   subtype INFILETYPE is TEXT_IO.FILE_TYPE;
   procedure READFILE(INFILE: in INFILETYPE; START: out LINEPOINT);
   -- Read the input file and store each line into the
   -- internal line structure using the LINE module
   -- INFILE : The input file that has the source program
   -- START  : The starting line ID of the internal
   --          structure
end INPUT;


(2). Output Module. This module hides the output
media. It outputs the indented results, the construct form
of the input program, and the input itself, using other
modules - indent, line and so on.

with TEXT_IO;
with LINE;
with INDENT;
generic
   type LINETYPE is private;
package OUTPUT is
   subtype OUTFILETYPE is TEXT_IO.FILE_TYPE;
   subtype CODEFILETYPE is TEXT_IO.FILE_TYPE;
   procedure PRINT_OUTFILE(OUTFILE: out OUTFILETYPE; START: in LINEPOINT);
   -- Print the indented output into OUTFILE using
   -- the indent and line modules
   -- OUTFILE : The output file that has
   --           the indented source program
   -- START   : Line start ID of the internal structure
   procedure PRINT_CODEFILE(CODEFILE: out CODEFILETYPE; START: in LINEPOINT);
   -- Print the code documentation using the line and
   -- indent modules
   -- CODEFILE : The output file that has
   --            the code documentation
   -- START    : Line start ID of the internal structure
end OUTPUT;


(3). Statement Module. This module manages the
statement object and also provides a set of procedures,
built on the line module procedures, for the other modules
that use the statement object.

with LINE;
generic
   type INDENTPOINT is private;
package STATEMENT is
   subtype NUM is NATURAL;
   subtype CHAR is CHARACTER;
   type INDENTPOINT is access INDENTNODE;
   type POSITION is record
      line : LINEPOINT;
      column : NUM;
   end record;
   type STATETYPE is record
      from : POSITION;
      to : POSITION;
      information : INDENTPOINT;
   end record;
   type NODE;
   type STATEPOINT is access NODE;
   type NODE is record
      content : STATETYPE;
      front : STATEPOINT;
      rear : STATEPOINT;
   end record;
   procedure GET_STATE_DELIMITOR(D: in CHAR);
   -- Get the statement delimiter
   function END_OF_STATE(D: CHAR) return BOOLEAN;
   -- Check the end of a statement
   procedure GET_STATE(P: in out STATEPOINT; L: out STATETYPE);
   -- Get a statement using the LINE module
   procedure PUT_STATE(P: in out STATEPOINT; L: in STATETYPE);
   -- Put a statement using the LINE module
   procedure STATE_LENGTH(P: in STATEPOINT; N: out NUM);
   -- Compute the length of a given statement
   procedure RECOGNIZE_STATEMENT(P: in out STATEPOINT; L: in LINEPOINT);
   -- Recognize the statement from
   -- the internal line structure
   procedure GET_CHAR(P: in STATEPOINT; N: in NUM; C: out CHAR);
   -- Get a character from the given statement
   -- and column
   procedure PUT_CHAR(P: in STATEPOINT; N: in NUM; C: in CHAR);
   -- Put a character into the given statement
   -- and column
   procedure FRONT_INSERT(P: in out STATEPOINT; L: in STATETYPE);
   -- Insert the given statement in
   -- front of the given statement ID
   procedure REAR_INSERT(P: in out STATEPOINT; L: in STATETYPE);
   -- Insert the given statement at the
   -- rear of the given statement ID
   underflow : exception;
   overflow : exception;
end STATEMENT;


(4). Indent Module. This module indents each line
using the line module, statement module and blank module.
The indentation policy is decided here, e.g. the size of
each level, the treatment of blanks, and so on.

with BLANK;
with STATEMENT;
with LINE;
generic
   type POLICYTYPE is private;
   type CONSTRUCTTYPE is private;
package INDENT is
   type INDENTNODE is record
      level : NUM;
      construct : CONSTRUCTTYPE;
   end record;
   type INDENTPOINT is access INDENTNODE;
   procedure INDENT(P: in STATEPOINT; L: out LINETYPE);
   -- Indent a line, i.e. insert or delete blanks
   -- and make line breaks according to the source
   -- program syntax, using the information about
   -- level, construct type and so on
   procedure GET_POLICY(P: in POLICYTYPE);
   -- Get the indentation and objective policies,
   -- for example, each level has 3 blanks
   -- and with indentation error messages
   procedure PUT_POLICY(P: out POLICYTYPE);
   -- Put the indentation and objective policies
   procedure GET_INFORMATION(P: in STATEPOINT; L: out STATETYPE);
   -- Get the information for indentation
   -- and level documentation
   procedure PUT_INFORMATION(P: in STATEPOINT; L: in STATETYPE);
   -- Put the information for indentation
   -- and level documentation
end INDENT;
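
To make the role of the policy concrete, the sketch below indents one statement from its level and a policy record. It is a hedged illustration, not the thesis code: the `Policy` fields, the assumption that statement text starts in column 7 (as in fixed-form FORTRAN), and the omission of statement labels and line breaking are all simplifications introduced here.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    blanks_per_level: int = 3   # "each level has 3 blanks"
    first_column: int = 7       # assumed: statement text starts in column 7


def indent(statement_text, level, policy):
    """Return the statement re-indented for the given nesting level."""
    body = statement_text.strip()
    margin = (policy.first_column - 1) + level * policy.blanks_per_level
    return " " * margin + body
```

Keeping the policy in its own record means that a change such as four blanks per level touches only the policy value, not the indentation procedure.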

(5). Program Module. This module hides the program
characteristics. It is highly dependent on each programming
language. It has two procedures - scanner and parser.

with LINE;
with STATEMENT;
with BLANK;
with SYMBOLTABLE;
with LEVEL;
with STACK;
with QUEUE;
package PROGRAMPART is
   procedure SCANNER(P: in out STATEPOINT; L: out ITEMTYPE);
   -- Scan the source program and recognize
   -- each statement type for the parser
   procedure PARSER;
   -- Recognize the construct of the source program
end PROGRAMPART;
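
The scanner's job of recognizing each statement type can be illustrated with a small keyword-driven classifier for FORTRAN statements. This is a sketch under assumptions: the statement kinds, the keyword lists and the regular expressions are hypothetical, the input is assumed to follow the blank rules of the standard form, and a typed header such as REAL FUNCTION would be reported as a declaration.

```python
import re

# Assumed keyword patterns; the thesis does not list its keyword table.
HEADER = re.compile(r"(SUBROUTINE|FUNCTION|BLOCK\s+DATA)\b")
DECLARATION = re.compile(
    r"(DOUBLE\s+PRECISION|INTEGER|REAL|LOGICAL|COMPLEX|"
    r"DIMENSION|COMMON|EQUIVALENCE|DATA|EXTERNAL)\s")
DO_LOOP = re.compile(r"DO\s+\d+\s+\w+\s*=")
IF_STATEMENT = re.compile(r"IF\s*\(")


def classify(statement):
    """Return a coarse statement kind for one FORTRAN statement."""
    # Drop a leading statement label and normalize the case.
    text = re.sub(r"^\s*\d+\s*", "", statement.upper()).strip()
    if HEADER.match(text):
        return "HEADER"
    if DECLARATION.match(text):
        return "DECLARATION"
    if DO_LOOP.match(text):
        return "DO"
    if IF_STATEMENT.match(text):
        return "IF"
    if text in ("CONTINUE", "RETURN", "STOP", "END"):
        return text
    return "SIMPLE"
```

A parser can then combine these kinds with the GO TO and label pattern of the structured form to decide which construct each statement belongs to.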


(6). Master Module. This module controls all of the
above modules.

with PROGRAMPART;
with INDENT;
with INPUT;
with OUTPUT;
procedure MASTER;
-- Control all the modules for reformatting
-- and level structure documentation

Figure 5.1 Module Interface.


The above figure explains the interfaces of each module.
The direction of an arrow indicates the using module.


D.  EXAMPLE (FORTRAN)

1.  Standard Form

There have been many attempts to standardize the
FORTRAN programming language. Here, the standard form will
follow the concept of COMPATIBLE FORTRAN [Ref. 1]. The
following represents the rough standard form.


a.  Basic Components

It  consists  of  four  elements  -  character  set, 
symbolic  names,  constants  and  array  elements. 

b.  Statements

(1). Statement Components. Statements are made up of such
components as labels, keywords, symbolic names, constants
and special characters. For Compatible FORTRAN, stricter
rules should be observed:

(1) Statement labels, keywords, symbolic names and integer
constants should not have embedded blanks, except for the
keywords GO TO, DOUBLE PRECISION and BLOCK DATA, which may
have blanks in the positions shown.

(2) Where two alphabetic or numeric statement components
come together with no other special characters between them,
a blank should be inserted. Examples are:

DO15I=1,10     should be written     DO 15 I=1,10
REWINDJ        should be written     REWIND J
REALAAA        should be written     REAL AAA

(3) Keywords, labels, symbolic names or constants should not
be split between two lines.

(2). END Line. END is not considered a statement
but is a type of line. It may not be labelled, executed
or continued. Note especially that END is not an executable
statement with the same effect as RETURN in a subprogram or
STOP in the main program.

(3). Format of Statements. The Standard limits
each statement to one initial line and not more than 8
continuation lines.
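
The continuation-line limit can be checked mechanically while lines are grouped into statements. The sketch below assumes fixed-form FORTRAN conventions that the text does not spell out (a 'C' in column 1 marks a comment, and a character other than blank or zero in column 6 marks a continuation line); the function names are hypothetical, and the limit of 8 is taken from the paragraph above.

```python
MAX_CONTINUATIONS = 8  # limit quoted in the text above


def is_comment(line):
    """A comment line has 'C' in column 1 (assumed convention)."""
    return line[:1] in ("C", "c")


def is_continuation(line):
    """A continuation line has a non-blank, non-zero character in column 6."""
    return (not is_comment(line)
            and len(line) >= 6
            and line[5] not in (" ", "0"))


def group_statements(lines):
    """Group source lines into statements, skipping comments and blank lines."""
    statements = []
    for line in lines:
        if is_comment(line) or not line.strip():
            continue
        if is_continuation(line):
            if not statements:
                raise ValueError("continuation line without an initial line")
            statements[-1].append(line)
            if len(statements[-1]) - 1 > MAX_CONTINUATIONS:
                raise ValueError("more than 8 continuation lines")
        else:
            statements.append([line])
    return statements
```

Each inner list is one statement: its first element is the initial line and the rest are its continuation lines.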

(4). Order of Statements. The following table
shows the order of statements. By 'header statement' is meant
a SUBROUTINE, FUNCTION or BLOCK DATA statement. Horizontal
lines within the table indicate that entities above the
line must precede entities below the line (if present).

TABLE III

Table of Statement Order

Header Statement
Type Statements
Comment Lines
DIMENSION Statements     EXTERNAL
COMMON Statements
EQUIVALENCE Statements
DATA Statements
Statement Functions
Executable Statements
STOP line
FORMAT Statements
END line

Vertical lines indicate that the entities on either side of
the line may be intermingled [Ref. 1].

c.  Specification Statements

Specification statements are non-executable
statements which give information to the compiler. They
consist of the TYPE (DOUBLE PRECISION, INTEGER, REAL,
LOGICAL, and COMPLEX), DIMENSION, COMMON, DATA and
EQUIVALENCE statements.

d.  Transfer of Control

This consists of the GO TO statement, Computed
GO TO statement, RETURN and STOP statements, Arithmetic IF
statement, Logical IF statement, DO statement, and CONTINUE
statement.

e.  Input/Output

This consists of the WRITE statement, READ
statement, ENDFILE statement, REWIND statement, BACKSPACE
statement and FORMAT statement.

f.  Expression and Assignment

This consists of the Arithmetic Expression,
Logical Expression, and Assignment statement.

g.  Program Units

This consists of the Main program, Function
subprograms, Block Data, and Subroutine subprograms.

2.  Structured Form

The algorithm language [Ref. 18] is convenient for
representing the generalized construct structure. So, to
represent the structured FORTRAN form, it is compared
with the algorithm language. The detailed structured forms
are as follows:

ALGORITHM LANGUAGE                          FORTRAN IV

1. ALGORITHM

ALGORITHM algorithm_name                    same, with 'C' in column 1
   statements
END algorithm_name

2. IF_THEN_single statement

IF condition THEN                                 IF (condition) statement
   statement
END IF

3. IF_THEN_multiple statements

IF condition THEN                                 IF (.NOT. condition) GO TO 10
   statements                                        statements
END IF                                         10 CONTINUE

4. IF_THEN_ELSE construct

IF condition THEN                                 IF (.NOT. condition) GO TO 5
   statements_1                                      statements_1
ELSE                                                 GO TO 6
   statements_2                                 5 CONTINUE
END IF                                               statements_2
                                                6 CONTINUE

5. Multiway selection : ELSE IF

IF condition_1 THEN                               IF (.NOT. condition_1) GO TO 10
   statements_1                                      statements_1
ELSE IF condition_2 THEN                             GO TO 20
   statements_2                                10 IF (.NOT. condition_2) GO TO 11
ELSE IF condition_3 THEN                             statements_2
   statements_3                                      GO TO 20
ELSE                                           11 IF (.NOT. condition_3) GO TO 12
   statements_4                                      statements_3
END IF                                               GO TO 20
(ELSE is optional)                             12 CONTINUE
                                                     statements_4
                                               20 CONTINUE

6. WHILE repetition

WHILE condition DO                              5 IF (.NOT. condition) GO TO 6
   statements                                        statements
END WHILE                                            GO TO 5
                                                6 CONTINUE

7. REPEAT repetition

REPEAT                                          5 CONTINUE
   statements                                        statements
UNTIL condition                                   IF (.NOT. condition) GO TO 5

8. DO FOR repetition

FOR I <- L TO M BY N DO                           DO 10 I = L,M,N
   statements                                        statements
END FOR                                        10 CONTINUE
(BY N can be omitted, in                    (,N can be omitted, in which case
which case BY 1 is assumed)                 ,1 is assumed)

9. Multiway selection : CASE

CASE variable OF                                  IF (variable.LT.1) GO TO 20
1:                                                IF (variable.GT.3) GO TO 20
   statements_1                                   GO TO (11,12,13), variable
2:                                             11 CONTINUE
   statements_2                                      statements_1
3:                                                   GO TO 30
   statements_3                                12 CONTINUE
ELSE                                                 statements_2
   statements_4                                      GO TO 30
END CASE                                       13 CONTINUE
(ELSE is optional)                                   statements_3
                                                     GO TO 30
                                               20 CONTINUE
                                                     statements_4
                                               30 CONTINUE

10. FUNCTION

FUNCTION function_name(parm_1,...,parm_n)         data_type FUNCTION function_name(parm_1,...,parm_n)
   statements                                        statements
   function_name <- expression                       function_name = expression
END function_name                                 RETURN
                                                  END

11. PROCEDURE (SUBROUTINE)

PROCEDURE procedure_name(parm_1,...,parm_n)       SUBROUTINE subroutine_name(parm_1,...,parm_n)
   statements                                        statements
END procedure_name                                RETURN
                                                  END
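
The correspondence above is mechanical, which a small generator makes visible. The Python sketch below emits the FORTRAN IV form of the WHILE repetition (item 6). It is illustrative only: the function name, the caller-supplied labels, the three-blank body indentation and the extra parentheses placed around the condition are assumptions, not part of the thesis.

```python
def while_to_fortran(condition, body, top_label, exit_label):
    """Return fixed-form FORTRAN IV lines for WHILE condition DO body END WHILE.

    Labels occupy columns 1-5 and statement text starts in column 7.
    """
    lines = [f"{top_label:>5} IF (.NOT. ({condition})) GO TO {exit_label}"]
    lines += [" " * 9 + statement for statement in body]  # body, one level in
    lines.append(" " * 9 + f"GO TO {top_label}")           # jump back to the test
    lines.append(f"{exit_label:>5} CONTINUE")               # loop exit
    return lines


for line in while_to_fortran("I .LT. 10", ["I = I + 1"], 5, 6):
    print(line)
```

The pretty printer works in the opposite direction: it recognizes this label and GO TO pattern in the input and reports it as a WHILE construct.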


3.  Format Grammar

This grammar represents the construct format of
structured FORTRAN. It is a subset of the generalized format
structure. The control structure is limited to 5 structures
- if, case, while, until, and do. In the declaration part,
the declarations will be statements. For more detail, see
the grammar figures (Appendix B).

4.  Implementation

a.  Limitations

An ADA compiler was not available for this work.
So, the PASCAL programming language was used to implement
the system. This implementation is a little different from
the design of the previous section because PASCAL does not
support all the ADA programming features. In order to
simplify the implementation, just a subset of the system was
implemented, e.g. the UNTIL construct is omitted.

Also, the implemented system does not cover all of
standard FORTRAN - it does not include some keywords like
PAUSE, REWIND and so on. The other limitations are the
following:

1. All input programs should be syntactically correct to
get proper indentation and level documentation.

2. All input FORTRAN programs should conform to the
standard structured form described in the previous sections.

3. The input lines should be short enough to indent without
being extended onto the next line. That is, the implemented
system does not have the line break function.

b.  Internal  Data  Structure 

(1). Line Data Structure. The input line and
output line are represented as arrays of characters.
Normally, programming languages use 80 columns per line. In
actual programs, most lines do not use all of the columns;
the mean program line length is 34 characters [Ref. 2]. If
the maximum array is allocated for every line, space is
wasted. So, to save memory and make the line flexible, a
doubly linked data structure was used for the internal line
structure. Also, a sentinel node is used. It allows an easy
check for an empty input file.
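
A doubly linked list with a sentinel node can be sketched as follows. This is a minimal Python illustration and not the PASCAL implementation: the class names are hypothetical, and reading the `front` field as the link to the preceding line and `rear` as the link to the following line is an assumption about the record in the design.

```python
class Node:
    def __init__(self, content=None):
        self.content = content
        self.front = self   # link to the preceding line
        self.rear = self    # link to the following line


class LineList:
    def __init__(self):
        # The sentinel holds no line; an empty list is the sentinel alone.
        self.sentinel = Node()

    def is_empty(self):
        return self.sentinel.rear is self.sentinel

    def rear_insert(self, node, content):
        """Insert a new line immediately after `node`."""
        new = Node(content)
        new.front, new.rear = node, node.rear
        node.rear.front = new
        node.rear = new
        return new

    def front_insert(self, node, content):
        """Insert a new line immediately before `node`."""
        return self.rear_insert(node.front, content)

    def append(self, content):
        """Add a line at the end of the list."""
        return self.rear_insert(self.sentinel.front, content)

    def __iter__(self):
        node = self.sentinel.rear
        while node is not self.sentinel:
            yield node.content
            node = node.rear
```

Because the sentinel is always present, insertion needs no special case for an empty list, and an empty input file is detected with a single comparison.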

(2). Statement Data Structure. As shown above,
the relationship of line and statement is one to one or many
to one. Clearly, the statement can be represented by the
line data structure. So, a line record holds the information
about statements. Comment statements are ignored in the
statement representation.

(3). Construct Data Structure. The construct has
some relationship with the statements, e.g. one to one for
simple statements and one to many for others. The statements
can hold the construct information, since every construct
can be separated into statements. For example, the DO
construct consists of a DO_COND statement, a compound
statement and an END_DO statement. But here, the line will
also hold the construct information. This is possible since
the relationship of line and statement is also one to one
or many to one.
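
Once every statement carries its construct information, the level of each statement follows from a single pass over the statement kinds. The Python sketch below is an assumption-laden illustration: the kind names and the function are hypothetical, and it only counts nesting depth without checking that each closing statement matches its opening statement.

```python
OPENERS = {"IF", "DO", "WHILE", "CASE"}
CLOSERS = {"END IF", "END DO", "END WHILE", "END CASE"}
MIDDLES = {"ELSE", "ELSE IF"}


def assign_levels(kinds):
    """Pair each statement kind with its nesting level."""
    level = 0
    result = []
    for kind in kinds:
        if kind in CLOSERS:
            if level == 0:
                raise ValueError("closing statement without an open construct")
            level -= 1
            result.append((level, kind))
        elif kind in MIDDLES:
            if level == 0:
                raise ValueError("ELSE outside of a construct")
            result.append((level - 1, kind))   # aligned with its IF
        elif kind in OPENERS:
            result.append((level, kind))
            level += 1
        else:
            result.append((level, kind))        # simple statement
    if level != 0:
        raise ValueError("unclosed construct at end of program")
    return result


kinds = ["SIMPLE", "IF", "DO", "SIMPLE", "END DO", "ELSE", "SIMPLE", "END IF"]
for level, kind in assign_levels(kinds):
    print("   " * level + kind)
```

Printing each kind indented by its level, as in the usage lines, gives a listing of the same general shape as the level structure documentation.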

c.  The Program and Example Input/Output

Anyone interested in obtaining a copy of the
program should contact the author directly or the Computer
Science Department at the Naval Postgraduate School,
Monterey, California. The example input and output can be
found in Appendix C. The example program does not have
any meaning. It is written just to show the constructs of
programs and the results of program execution.


VI.  CONCLUSION 


One  of  today's  software  problems  is  the  very  high  cost 
of  developing  and  maintaining  software.  Much  research  has 
been  devoted  to  solving  this  problem.  One  way  to  solve 
today's  software  crisis  is  to  study  software  tools  that  can 
help  people  who  serve  in  the  software  area. 

This thesis designed and partially implemented a program
family of extended pretty printers that can help to solve
software problems by improving the readability and
understandability of programs.

The system will work for almost any structured
programming language and for various secondary functions
with only small changes in some modules. The design
presented here is for a program family of pretty printers.
The program implemented here is one member of this family.
Other members of the program family remain to be
implemented.

GENERALIZED CONSTRUCT FLOW CHART

Figure A.1 Program Structure.
Figure A.5 Compound Statement.
Figure A.6 If Statement.
Figure A.7 Case Statement.
Figure A.8 While Statement.
Figure A.9 Until Statement.
Figure A.10 Do Statement.
Figure A.11 Block Statement.

Figure B.1 Program Structure.
Figure B.2 Subroutine.
Figure B.3 Main.
Figure B.6 Case Statement.
Figure B.7 While Statement.
Figure B.15 State Chart 3.

      GO TO (5,6,7)
      CONTINUE
      RD
      RD
      GO
      CONTINUE
      RD(1)=6.0
      RD(2)=6.0
      GO TO 666
    7 CONTINUE
      DO 567 I=1,19
      RD(I)=FLOAT(I)
  567 CONTINUE
      RD(20)=40.0
  555 CONTINUE
      R1=4.9
      I1=4*I2
      STOP
      END


C
C     ********************************************
C     SUBROUTINE PROGRAM
C     ********************************************
C
      SUBROUTINE SUB (RD, ID)
C
C**** DECLARATION
C
      REAL R1, R2, R3, RD(20)
      INTEGER I1, I2, I3, ID(20)
      LOGICAL L1, L2, L3
C
C**** SIMPLE STATEMENT
C
      READ (5,100) I1, I2, I3
  100 FORMAT (3I5)
      L1 = .TRUE.
      R1 = 1.5
C
C***  IF STATEMENT
C***  DO STATEMENT
C
      IF (.NOT. (I1 .NE. 1)) GO TO 1
      DO 500 I = 1,20
      R(I)=0.0
  500 CONTINUE
      RETURN
      END

*** END OF INPUT ***

*** PROGRAM CONSTRUCT ***
=========================

DECLARATION
DECLARATION
DECLARATION
SIMPLE
SIMPLE
SIMPLE
SIMPLE
SIMPLE
IF (COND) THEN
DO (COND)
SIMPLE
END DO
ELSE IF (COND)
WHILE (COND) DO
SIMPLE
END WHILE
ELSE
IF (COND)
SIMPLE
ELSE
SIMPLE
END IF
SIMPLE
CASE VAR
CONST :
SIMPLE
SIMPLE
CONST :
SIMPLE
SIMPLE
CONST :
DO (COND)
END DO
SIMPLE
END CASE
SIMPLE
SIMPLE
STOP
END OF PROGRAM
SUBROUTINE
DECLARATION
DECLARATION
DECLARATION
SIMPLE
SIMPLE
SIMPLE
SIMPLE
IF (COND) THEN
DO (COND)
SIMPLE
END DO
RETURN
END OF PROGRAM